Abstract: This study demonstrates an application of cluster analysis to constant ambient air
monitoring data of four pollutants in the Kanto region: NOx, photochemical oxidant (Ox),
suspended particulate matter, and non-methane hydrocarbons. Constant ambient air 
monitoring can provide important information about the surrounding atmospheric pollution. 
However, at the same time, ambient air monitoring can place a significant financial burden 
on some autonomous communities. Thus, it has been necessary to reduce both the number of 
monitoring stations and the number of chemicals monitored. To achieve this, it is necessary 
to identify those monitoring stations and pollutants that are least significant, 
while minimizing the loss of data quality and mitigating the effects on the determination of 
any spatial and temporal trends of the pollutants. Through employing cluster analysis, 
it was established that the ambient monitoring stations in the Kanto region could be clustered
topologically for NOx and Ox into eight groups. From the results of this analysis,
it was possible to identify the similarities in site characteristics and pollutant behaviors. 

Keywords: cluster analysis; constant ambient air monitoring; Kanto region; NOx; Ox;
non-methane hydrocarbon; suspended particulate matter 



1. Introduction 

Constant ambient air monitoring can provide important information about surrounding atmospheric 
pollution. In Japan, constant ambient air monitoring of six priority substances (SO2, carbon monoxide
(CO), NOx, photochemical oxidant (Ox), suspended particulate matter (SPM), and non-methane
hydrocarbons (NMHC)) is conducted by local prefectural governments under the Air Pollution 
Control Act. These constant ambient air monitoring data are very useful for analyzing the current 
situation and trends of pollution within an area, and many research works in Japan that have used these 
data have been reported [1-3]. However, at the same time, ambient air monitoring places a significant 
financial burden on the autonomous communities. Thus, it is necessary to identify less significant 
monitoring stations and pollutants, while minimizing the loss of data quality and mitigating the effects 
on the determination of any spatial and temporal trends of the pollutants. There have been some trials 
to re-examine the efficacy of constant ambient air monitoring stations, for example, in the cities of
Shizuoka and Funabashi and in Hiroshima Prefecture in Japan. However, currently, no reliable guidelines
exist regarding the optimal method by which this could be achieved. 

In this study, we applied cluster analysis to constant ambient air monitoring data from the 
Kanto region of Japan, based on the expectation that similarities in site characteristics and pollutant 
behaviors could be identified, and that monitoring stations could be grouped topologically. 

Cluster analysis [4] and principal components analysis have been commonly used in air pollution 
studies. In particular, intensive cluster analyses for ozone (O3) and particulate matter (PM) pollution 
have been conducted. Lavecchia et al. [5] applied a cluster analysis to Italian ozone monitoring network 
data and discussed the similarities between the data of each monitoring station. Gramsch et al. [6] 
applied the same approach to O3 and PM10 concentrations and demonstrated that these two pollutants 
had similar cluster patterns, suggesting that these pollutants' concentrations were controlled by 
meteorological and topographical factors. Lu et al. [7] applied three different cluster analyses to 
PM10 pollution in Taiwan. Giri et al. [8] applied a hierarchical cluster analysis to seasonal PM10 data in 
Kathmandu, Nepal, and demonstrated that monsoon rainfall had only a limited effect on decreasing 
PM10 concentrations. Cluster analysis has also been applied to other pollutants, such as nitrogen oxides 
(NOx) [9,10], carbon dioxide (CO2) [11], sulfur dioxide (SO2) [9], pollen [12], and Pb [13].

We applied cluster analysis to constant ambient air monitoring data obtained in 1996 and 2006 in 
the Kanto region, which includes the capital region of Japan (Figure 1). By employing cluster analysis, 
ambient monitoring stations could be clustered topologically for NOx and Ox. Based on the results of
the analysis, it is possible to suggest ways of reducing both the number of monitoring stations and
the number of chemicals monitored.

2. Method

2.1. Air Monitoring Data 

This study focused on the Kanto region in Japan, including the seven prefectures of Tokyo, Gunma, 
Tochigi, Ibaraki, Chiba, Saitama, and Kanagawa. The Kanto region has approximately 500 ambient air 
monitoring stations, which account for one quarter of all the monitoring stations in Japan. The constant 
ambient air monitoring data obtained by 476 monitoring stations during the fiscal years of 1996 and 
2006 were used in this study. The monitoring data were kindly provided by the National Institute for 
Environmental Studies of Japan, and the data for the Kanagawa Prefecture were obtained from the 
prefecture's website. The priority pollutants in this study were NOx, Ox, NMHC, and SPM.
SO2 and CO were excluded because their concentrations in the area of interest were very low and did 
not show large spatial differences. There are two types of ambient air monitoring stations in Japan: 
general environmental air monitoring stations and vehicle emission monitoring stations [14]. 
The latter are arranged near large roadways to monitor air pollution caused by vehicle emissions. 
Monitoring data from both types of station were used for the cluster analysis. 

2.2. Cluster Analysis 

SPSS® Statistics 17.0 (SPSS Inc., Chicago, IL, U.S.) was used for the cluster analysis. The data 
matrices for each pollutant were prepared for cluster analysis using data from the air monitoring 
stations. In each matrix, the element in the ith row and jth column stands for the ith measurement in the
year from the jth air monitoring station. Missing data in the jth column were filled in with the
annual average value of the jth monitoring station. The percentages of missing data ranged from 1.42%
to 7.84%. The number of clusters was fixed at eight by considering both the total number of air monitoring
stations and the number of prefectures in the Kanto region. The squared Euclidean distance and
Ward's method were adopted for the cluster analysis.
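The procedure described above can be illustrated with a minimal sketch. The analysis in this study was carried out in SPSS; the pure-Python code below is only an illustration of the same idea under stated assumptions. The function names (impute_station_mean, ward_clusters), the toy station names, and the toy values are hypothetical and do not come from the monitoring data. Each merge step joins the pair of clusters whose union gives the smallest increase in the within-cluster sum of squared Euclidean distances, which is Ward's criterion.

```python
from statistics import mean


def impute_station_mean(series):
    """Replace missing values (None) with the mean of the station's observed values."""
    observed = [v for v in series if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in series]


def ward_clusters(stations, k):
    """Naive agglomerative clustering using Ward's criterion.

    stations: dict mapping station name -> list of measurements (equal lengths).
    Returns k clusters, each a sorted list of station names.
    """
    # Each cluster is stored as (member names, centroid vector).
    clusters = [([name], list(vals)) for name, vals in stations.items()]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                # Squared Euclidean distance between the two centroids.
                sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * sq
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (ma + mb, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical toy data: four stations, four measurements each, None = missing.
raw = {
    "A": [40.0, None, 44.0, 42.0],
    "B": [41.0, 43.0, 45.0, None],
    "C": [10.0, 12.0, None, 11.0],
    "D": [11.0, 13.0, 12.0, 12.0],
}
filled = {name: impute_station_mean(vals) for name, vals in raw.items()}
groups = ward_clusters(filled, k=2)
print(groups)
```

In this sketch each station is one observation and its measurement series is the feature vector, which corresponds to clustering the columns of the data matrix described above. For the real data, k would be set to eight.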

2.3. Concentration Contour Maps 

Concentration contour maps were used to clarify the characteristics of each cluster. A concentration 
contour map is a contour of pollutant concentrations in which the monitoring month is presented as the 
abscissa and the monitoring time as the ordinate. Tarasova et al. [15] and Zvyagintsev et al. [16] 
have used this method for air pollution analysis. Using these maps, the seasonal and diurnal variations 
of pollutant concentrations can be examined visually. 
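The grid underlying such a map can be sketched as follows. This is an assumed construction rather than the procedure actually used for the figures: the function name month_hour_grid and the sample records are hypothetical. The code averages all measurements that fall in the same month and the same hour of the day, giving a 24 x 12 table that a contour-plotting tool could then draw with the month on the abscissa and the hour on the ordinate.

```python
from collections import defaultdict


def month_hour_grid(records):
    """Average concentration for each (hour, month) cell.

    records: iterable of (month 1-12, hour 0-23, concentration) tuples.
    Returns grid[hour][month - 1]; cells without data are None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, hour, value in records:
        sums[(hour, month)] += value
        counts[(hour, month)] += 1
    return [
        [sums[(h, m)] / counts[(h, m)] if counts[(h, m)] else None
         for m in range(1, 13)]
        for h in range(24)
    ]


# Hypothetical hourly records: (month, hour, concentration in ppb).
records = [(1, 8, 50.0), (1, 8, 70.0), (7, 8, 20.0), (1, 14, 30.0)]
grid = month_hour_grid(records)
print(grid[8][0])   # mean of the two January 08:00 values
print(grid[8][6])   # the single July 08:00 value
```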

3. Results and Discussion 

3.1. NOx

3.1.1. General Environmental Air Monitoring Stations 

For NOx, the concentration data from the general environmental air monitoring stations and vehicle
emission monitoring stations were treated separately. The general environmental air monitoring
stations in the Kanto region were clustered into eight groups for NOx monitoring in both fiscal years
1996 and 2006 (Figure 2). The legend of Figure 2 shows the annual average NOx concentration for each
cluster. The annual average NOx concentration in the Kanto region was 39.4 ppb in 1996 and
decreased to 26.7 ppb in 2006. The annual average values were calculated by averaging all
measured NOx values in the year. This decrease in concentration was most likely because of the
Automobile NOx PM Control Law of Japan established in 2006. By applying cluster analysis to
the NOx data from the general environmental air monitoring stations, these monitoring stations were
clustered territorially. Furthermore, the territorial grouping of the monitoring stations was retained
over the decade. For example, cluster 7 in 1996, which shows the highest average NOx concentration
(64.5 ppb), also appears as cluster 7 in 2006 in the same region, again with the highest NOx
concentration (42.0 ppb). Clusters 4 and 5 in 1996 appear as clusters 5 and 6 in 2006 in the same
region. These results show that NOx monitoring could reasonably be eliminated at general
environmental air monitoring stations in such clusters, because the air pollution by NOx in these
regions is similar and the pollution situation did not change during the 10-year interval.
Clusters with the highest average NOx concentrations, such as clusters 6 and 7 in 1996, were located in
the Tokyo, Kanagawa, and Chiba prefectures, where traffic volumes were highest.

Figure 3 presents NOx concentration contour maps for clusters 6 and 7 in 2006. While the absolute
concentrations differ between clusters, all eight clusters show similar seasonal and diurnal
concentration variations, with two diurnal peaks at around 08:00 and 20:00 and a seasonal peak during
the winter. The NOx concentration contour maps in 1996 show a similar trend to those in 2006
(data not shown). The diurnal variation in NOx concentrations can be explained partially by variations
in traffic volume. However, the traffic volume on national roads in the Kanto region begins to increase
during the early morning at around 04:00 and remains almost constant from 07:00 to 18:00 [17].
Therefore, variations in traffic volume cannot fully explain the decreasing NOx concentrations during
the daytime. Other factors, such as NOx removal by convection or photochemical reactions during
the daytime, should also be considered in explaining the diurnal variations in NOx concentrations.
The seasonal variation in NOx concentrations can be explained by the development of
ground-based inversion layers during the winter, which effectively trap NOx near the ground surface.
Furthermore, ultraviolet light is weaker during the winter, which lowers the concentrations of
hydroxyl radicals and thus increases the lifetime of NOx.



Figure 2. Results of clustering using NOx monitoring data from the general environmental
air monitoring stations: (A) in 1996 and (B) in 2006. 







Figure 3. NOx concentration contour maps for clusters 6 (A) and 7 (B) in 2006.

3.1.2. Vehicle Emission Monitoring Stations

Vehicle emission monitoring stations in the Kanto region were not well classified into territorial
groups for NOx monitoring in 1996 and 2006 (Figure 4), which is reasonable because these monitoring
stations were installed near large roads to monitor regional air pollution from vehicular emissions.
Very small clusters containing fewer than 10 air monitoring stations were identified. The air monitoring
stations included in such small clusters show unusual NOx concentration variations. For example,
the vehicle emission monitoring station at Matsubarabashi on the circular Route Kanjo 7 formed
a cluster on its own in 2006. Vehicle emissions make this road one of Japan's most polluted areas.
In this area, the Ministry of Land, Infrastructure and Transport of Japan and the Tokyo Metropolitan
Government conducted air purification experiments from 2003 to 2005. It is interesting that this highly
polluted area could be extracted automatically by the cluster analysis. During the decade,
the annual average NOx concentration at the vehicle emission monitoring stations also decreased,
from 96.3 ppb in 1996 to 63.1 ppb in 2006.

Figure 4. Results of clustering using NOx monitoring data from the vehicle emission
monitoring stations: (A) in 1996 and (B) in 2006.

3.2. Ox

Figure 5 presents the clustering results for air monitoring stations using Ox concentrations.
The Ministry of the Environment of Japan defines Ox as oxidizing substances, such as ozone and
peroxyacetyl nitrate, that are produced by photochemical reactions (limited to those that liberate
iodine from a neutral potassium iodide solution, and excluding NO2). Similar to the case for
NOx measured by the general environmental air monitoring stations, air monitoring stations in the
Kanto region were clustered well territorially for Ox. Despite some elimination and consolidation,
the territorial grouping was preserved after 10 years. These results show the possibility of
reasonably eliminating Ox monitoring at air monitoring stations.

Variations in the annual average Ox concentration among areas were small, and it is speculated that
the clusters were grouped by concentration variations or other factors (e.g., time variation trends),
rather than by absolute concentrations. The annual average Ox concentrations in the Kanto region
in 1996 and 2006 were 23.6 and 25.1 ppb, respectively.
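A small numerical illustration may help to show why stations with nearly identical annual averages can still be separated by a distance-based clustering. The three series below are hypothetical and are not taken from the monitoring data. Two of them share the same mean but peak at different times, while the third has a slightly higher mean but the same timing as the first.

```python
from statistics import mean


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical concentration series (ppb) at four times of day.
s1 = [10.0, 20.0, 40.0, 30.0]
s2 = [40.0, 30.0, 10.0, 20.0]  # same mean as s1, opposite timing
s3 = [12.0, 22.0, 42.0, 32.0]  # slightly higher mean, same timing as s1

print(mean(s1), mean(s2), mean(s3))
print(squared_euclidean(s1, s2))  # large despite equal means
print(squared_euclidean(s1, s3))  # small despite different means
```

In this toy case s1 lies much closer to s3 than to s2, so the temporal pattern rather than the average level drives the grouping, which is consistent with the speculation above.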



Figure 5. Results of clustering using Ox monitoring data: (A) in 1996 and (B) in 2006.







Figure 6 shows the concentration contour maps for clusters 2 and 4 in 2006. In 2006, cluster 2 had
the highest annual average Ox concentration, but concentrations exceeding the
environmental standard in Japan (60 ppb) were not observed regularly. The high annual average Ox
concentration in this cluster was attributed to its higher Ox concentrations during the winter
compared with other clusters. In 2006, cluster 4 had the lowest annual average Ox concentration.



Figure 6. Ox concentration contour maps for clusters 2 (A) and 4 (B) in 2006.



In cluster 3, in 2006, Ox concentrations higher than the environmental standard (60 ppb) were
observed more frequently, despite the cluster having an annual average concentration that could be
considered normal. Cluster 3 was located in Saitama Prefecture, but also encompassed parts of the
Tochigi and Ibaraki prefectures.
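The distinction between the annual average and the frequency of exceedance can be made concrete with a short sketch. The two series are hypothetical; only the 60 ppb threshold is taken from the text. The helper name summarize is an assumption for illustration.

```python
from statistics import mean

STANDARD_PPB = 60.0  # environmental standard for Ox quoted in the text


def summarize(hourly_ppb):
    """Return (mean concentration, number of values above the standard)."""
    exceedances = sum(1 for v in hourly_ppb if v > STANDARD_PPB)
    return mean(hourly_ppb), exceedances


# Hypothetical series with the same mean but different peak behavior.
steady = [30.0] * 8
peaky = [10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 70.0, 80.0]

print(summarize(steady))
print(summarize(peaky))
```

Both series average 30 ppb, yet only the second exceeds the standard. An unremarkable annual average therefore does not rule out frequent exceedances, which is the situation described for cluster 3.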

3.3. NMHC 

Figure 7 presents the cluster analysis results using NMHC concentrations. In 1996, the Kanto region 
was divided approximately into two areas: an urban area and a surrounding suburban area. 
However, the division is not obvious because NMHC concentrations are affected by local pollution
sources, similar to the clustering of NOx concentrations at the vehicle emission monitoring stations.
The territorial division became more ambiguous in 2006. It is evident that NMHC pollution changed in
the East Kanto area over the decade. These results indicate that it is difficult to eliminate monitoring 
stations for NMHC by the cluster method. The annual average NMHC concentration in the Kanto 
region decreased from 331.6 ppb in 1996 to 233.9 ppb in 2006. The percentage of vehicle emission 
monitoring stations was especially high in clusters 5 and 8, where the average annual NMHC 
concentrations were high. This suggests that vehicle emissions influence NMHC pollution within 
the region. 



Figure 7. Results of clustering using NMHC monitoring data: (A) in 1996 and (B) in 2006. 







Figure 8 shows the NMHC concentration contour map for cluster 5 in 2006, which had
the highest average concentration (388 ppb). Similar to other contour maps, two peaks of NMHC
concentration can be observed in many areas during the summer and winter. One possible reason for
the higher NMHC concentration in the summer season could be the enhanced emissions and atmospheric
chemical reactions under higher temperatures. Peak NMHC concentrations during the winter can be
explained by the development of ground-level inversion layers trapping NMHC emissions.

3.4. SPM

Figure 9 shows the cluster analysis results of air monitoring stations using SPM concentrations. 
In the central urban area, some overlaps of clusters were observed. Overall, the air monitoring stations 
in the Kanto region were clustered into eight territorial groups for both years. Some differences among 
the clustering for both years were observed. Clusters 2 and 3 in 1996, in the north section of Chiba, 
merged into one cluster (cluster 2) in 2006. Cluster 5 in 1996 split into three groups: clusters 3, 4, and 6.
These results indicate that the air pollution situation changed in the Kanto region during the decade.
In addition, it appears difficult to eliminate monitoring stations for SPM using the cluster
method. The annual average SPM concentration in the Kanto region was 45.5 µg/m3 in 1996,
which decreased to 29.3 µg/m3 in 2006. This large decrease in concentration was mainly caused by the
implementation of regulations under the Automobile NOx PM Control Law of Japan.
The banning of small incinerators to prevent dioxin emissions also contributed to the decrease in SPM
concentrations. The annual average SPM concentrations were highest in clusters 2, 5, and 6.
The percentages of vehicle emission monitoring stations in these clusters were high. Similar to NOx,
if the monitoring data from the vehicle emission monitoring stations were treated separately, 
the SPM pollution characteristics in the Kanto region would be clearer. 

Figure 10 presents the SPM concentration contour maps for clusters 5 and 4 in 2006. In 2006, 
SPM pollution during the summer was prominent in all eight clusters. The SPM concentration contour
maps for clusters 4 and 5 showed the typical concentration variation trend observed in the Kanto region:
a concentration peak in the morning from June to July (the rainy season) and a concentration peak in
the evening in December. High concentrations of SPM during the summer can be explained by
accelerated photochemical reactions. Kaneyasu et al. [18] reported that high SPM concentrations
during the rainy season in Japan could also be caused by meteorological factors. In 2006, cluster 4,
located from the northern area of Saitama to the center of Gunma, showed distinctive SPM concentration
variations, i.e., concentrations increased in the evening during the summer. In this area, southeasterly
winds are dominant during the summer, suggesting that winds transported NOx and NMHC that were
emitted in the urban area during the day. Photochemical reactions occurring during their transport
would contribute to the elevated SPM concentrations in cluster 4 in the evening.



Figure 9. Results of clustering using SPM monitoring data: (A) in 1996 and (B) in 2006. 







Figure 10. SPM concentration contour maps for clusters 5 (A) and 4 (B) in 2006.

4. Conclusions

Ambient air monitoring stations in the Kanto region were clustered into eight groups using constant 
air monitoring concentrations of NOx, Ox, NMHC, and SPM. The air pollution characteristics of the
clusters were analyzed using concentration contour maps. It was confirmed that the ambient
monitoring stations could be clustered topologically for NOx and Ox using cluster analysis.
If the ambient air monitoring stations could be reasonably grouped, then a method for reducing the 
number of both monitoring stations and chemicals monitored should be possible. Such a method 
should be simple, versatile, and mechanical and thus, we suggest that the number of monitoring 
stations in the Kanto region could be reduced by adopting the following three simple criteria: 
(1) retain the monitoring station (or chemical) if similarities exist between its monitored data and the 
averaged monitored data of the cluster to which it belongs; (2) retain the monitoring station 
(or chemical) if the monitored data show higher concentrations; and (3) retain the monitoring station 
(or chemical) if the monitored concentration levels exhibit an increasing trend. For the first criterion, 
Euclidean distances between each element of monitored data and the average of the monitored data in 
the topological group matrix were calculated, and only the top 5%-15% of monitoring stations with
the smallest Euclidean distances were retained. For the second criterion, the top 5%-15% of monitoring
stations with the highest annual averaged concentrations in 1996 or 2006 were retained. For the third
criterion, the top 5%-15% of monitoring stations with the highest ratio of annual averaged
concentrations in 2006 to 1996 were retained. The retention ratio for each criterion was varied within
the range of 5%-15%. When the retention ratio was set at 10%, over 30% of monitoring stations could 
be removed by adopting the above criteria. In our next paper, we will describe this suggested 
method in greater detail. 
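As a rough illustration of how the three criteria might be combined within one cluster, consider the sketch below. It is not the procedure to be reported in the next paper. The function names (top_fraction, select_stations), the rounding rule (round up, keep at least one station), the reading of "1996 or 2006" as the larger of the two annual averages, and all station names and values are assumptions made for this example.

```python
import math


def top_fraction(scores, fraction, largest=True):
    """Names ranked in the top `fraction` by score (assumed rule: round up, at least 1)."""
    n_keep = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=largest)
    return set(ranked[:n_keep])


def select_stations(series_2006, mean_1996, fraction):
    """Return the stations of one cluster retained by any of the three criteria.

    series_2006: station -> list of 2006 measurements (equal lengths).
    mean_1996:   station -> annual average concentration in 1996.
    """
    names = list(series_2006)
    length = len(series_2006[names[0]])
    mean_2006 = {s: sum(series_2006[s]) / length for s in names}
    # Criterion 1: smallest Euclidean distance to the cluster-average series.
    centroid = [sum(series_2006[s][i] for s in names) / len(names)
                for i in range(length)]
    dist = {s: math.sqrt(sum((x - c) ** 2
                             for x, c in zip(series_2006[s], centroid)))
            for s in names}
    # Criterion 2: highest annual average (assumed: larger of the two years).
    peak = {s: max(mean_1996[s], mean_2006[s]) for s in names}
    # Criterion 3: highest ratio of the 2006 to the 1996 annual average.
    ratio = {s: mean_2006[s] / mean_1996[s] for s in names}
    return (top_fraction(dist, fraction, largest=False)
            | top_fraction(peak, fraction)
            | top_fraction(ratio, fraction))


# Hypothetical cluster of four stations with two measurements each.
series_2006 = {
    "S1": [20.0, 22.0],
    "S2": [30.0, 32.0],
    "S3": [26.0, 26.0],
    "S4": [23.0, 29.0],
}
mean_1996 = {"S1": 14.0, "S2": 40.0, "S3": 30.0, "S4": 28.0}
kept = select_stations(series_2006, mean_1996, fraction=0.25)
print(sorted(kept))
```

In this toy cluster S3 is retained as the most representative station, S2 for its high concentration, and S1 for its increasing trend, so S4 is the only candidate for removal.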